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Abstract 



We report results on the scaling properties of changes in contrast of natural 
images in different visual environments. This study confirms the existence, in 
a vast class of images, of a multiplicative process relating the variations in 
contrast seen at two different scales, as was found in pi, @|. But it also shows that 
the scaling exponents are not universal: Even if most images follow the same 
P^ . type of statistics, they do it with different values of the distribution parameters. 

Motivated by these results, we also present the analysis of a generative model 
of images that reproduces those properties and that has the correct power 
spectrum. Possible implications for visual processing are also discussed. 
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1 Introduction 

As it has been suggested a long time ago U, the early stages of the visual system 
must have evolved by adaptation to the statistics of the external stimuli. During this 
process the neurons in the visual pathway have developed their receptive fields in such 
a way that information about visual scenes is represented internally in an efficient 
way. The large amount of redundancy present in the external world is, at least in 
part, eliminated from the internal sensory code. 

In order to achieve this quasi-optimal representation the visual system must have 
learnt the regularities present in the environment where the organism lived. If a given 
conjunction of some elementary features tend to appear together, a cell responding 
optimally to the combination of features is rather likely to exist. 

Carrying out this program requires, as a first step, to perform an analysis of the 
statistical properties of the environment. In the case of visual scenes, a systematic 
study of this matter has began rather recently |2|, |[ f|, |], H; • Although the relevance 
of the second order statistics has been pointed out some time ago & M, and a gaussian 



distribution for images has very often been used [|10| , \[% \L2\ , |13f to make predictions 
on the receptive fields of cells in VI and in previous stages of the visual pathway, 
there are many reasons to believe that this statistics leaves aside a vast number of 
qualitatively important properties. An indication of this is that once the image is 
decorrelated (i.e. the correlations between pairs of pixels are eliminated) the scene 



can still be recognized [I3|, something that is mainly due to the fact that the borders 
of the objects are still present. 

As it was emphasized in || ||, |], a better understanding of the statistics of images 
should be achieved before making predictions on the visual system. In fact, an analysis 
of the properties of changes in contrast in natural scenes revealed the existence of 
multiscaling properties: images do not have uniform scale properties, but they can be 
decomposed in sets of pixels such that only those pixels in a given set have similar 
scale properties. [] Interestingly enough, these properties can be explained ||, Q by 
means of a simple model, that obtains the statistics of changes in contrast at a scale 
r in terms of an independent multiplicative process applied to the changes occurring 
a a larger scale L. 

The multiplicative process is a log-Poisson distribution. The events it generates 
represent sharp changes, or modulations, of the contrast gradient. It contains two 
parameters: the average number of events per unit of scale, s, and the strength of 
each of these elementary changes, /3. For the ensembles considered in 0, [(J these 
parameters took the values s«l. and (3 ~ 0.5. | 

The set of images used for those studies was rather uniform, they were generally 
forest scenes, and their contrast distribution was also similar from image to image. 
This fact made the statistical analysis very stable, and the existence of the multi- 
plicative relation between different scales was clear. It left however open the question 
of how much dependent those properties are on the chosen image ensemble. Their 

1 Image structure in scale-space has been considered by several authors, although from a perspec- 
tive different from ours [|l5| [Tq ]. 

2 Similar processes also occur in the physics of turbulent flows |17], [lj, [Hj, [20(] . 



possible relevance for the development of the early visual system would be greatly 
diminished if they only held in a restricted set of images. On the other hand, if the 
same statistical properties of changes in contrast could be found in a large set of 
visual scenes, their importance would be more clear. 

In this paper we address the issue of robustness of the properties of edges and 
softer textures analyzed in @, ||. m order to establish their validity we use a larger 
data set (more than a hundred times larger than the one used in our first analysis) 
and more heterogeneous images, containing many different environments. 

Our result is both surprising and encouraging. The statistical analysis can be 
performed on an image by image basis. When this is done, one finds that most of 
them (375 out of 400) exhibit the scale properties found in the smaller data set. 
Besides, the same log-Poisson model can be applied to explain the data. That is, 
for most images, contrast changes at different scales are related by the same type of 
multiplicative process. Nevertheless, the model parameter (3 depends on the image. 

We have also carried out a similar statistical analysis for ensembles of images, 
where each ensemble was characterized by the type of scenes. Again, each of the 
ensembles analyzed presents the same multiscaling properties, although with different 
values of the parameter (3. Finally, we have done the study over the whole set of 
images, and have again found that the same model is able to explain the observed 
quantities. 

Given these results, our conjecture is that the visual system has adapted, during a 
long period of time, to the existence of these multiscale properties and, in particular, 
that its architecture has captured the existence of the multiplicative process. On the 
other hand, since different visual scenes differ in the value of the parameter /3, the 
adaptation to this should occur more rapidly, and could be implemented, for instance, 
by gain control mechanisms. 

Stochastic generative models of images can be useful to understand the role played 
by the statistical properties of images in the structure of receptive fields. These models 
become very simple if it is assumed that only the second order statistics is relevant; 
in this case the contrast distribution is just a gaussian with the correct correlation 
function between pairs of pixels. But the definition of a model able to describe cor- 
rectly the statistics of changes in contrast requires a different approach. Here we have 
studied a model that reproduces the properties exhibited by the natural images ana- 
lyzed in the first part of this work. As we will see, it also possesses the correct power 
spectrum. This generative model will also allow us to see more clearly the role of the 
log-Poisson parameters on the geometrical structure of the images. 

The layout of this paper is as follows. The next section contains a brief theoretical 
background about the multiplicative log-Poisson process and multifractality of images. 
The image data sets used in this work are described in Sec. |^, as well as the criteria 
that have been used to select them. The results about the statistics of changes in 
contrast and the existence of the multiplicative process in the various data sets are 
presented in Sec. |j. Sec. |5] is devoted to the presentation and analysis of a generative 
model of images that follows the non-gaussian statistical properties discussed in the 
previous sections. The last section is devoted to discussions, where the relevance of 
this work for the visual system is briefly discussed. 



2 Multiscaling properties of images and the mul- 
tiplicative process 

In this section we give the basic theoretical background necessary for our numerical 
analysis. For further details the reader should consult the original work 0, |l| §, [?[] 

2.1 The local linear edge variance 

The contrast C(x) is defined as C(x) = I(x) — (I), where I(x) is the field of lumi- 
nosities and (I) its average value across the image ensemble. Since we are interested 
in quantifying changes of the contrast C(x) at the point x and at a given scale r, it 
is natural to consider the following quantities [|, |l| : 



Cfc,r( x ) 

and 



xi+r 



'ac(x') 

dx[ 
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dx\ (1) 

x'={x' 1 ,x 2 } 



dx' 2 (2) 



The variables e? li7 .(x) and e„,r(x) are defined at the position x and at a scale r. While 
the first takes contributions from edges transverse to a horizontal segment of size r, 
the second accumulates changes in contrast transverse to a vertical segment of the 
same size. 

These variables have been interpreted as the local linear edge variance along the 
corresponding direction and at the scale r. They describe how far the changes in con- 
trast are from being uniformly distributed in a segment, either horizontal or vertical, 
of size r. If these changes were constant, the estimators would be independent of r 
and, on the contrary, a dependence on r indicates the existence of geometrical struc- 
ture. Although in this work we will discriminate between the vertical and horizontal 
directions, we mention that the behaviour of contrast changes at a scale r can also 
be studied by means of a bi-dimensional integral : 

e r (x) = — / dx[ / dx' 2 \ VC(x') | . (3) 



r ./asi-j Jx 2 -^ 



Similarly to the other two variables, e r (x) quantifies the deviation from uniform of 
the distribution of |VC|. 

2.2 The multiplicative process 

The question now arises of how the statistics of ej^(x ) (where j = h, v) at the scale 
I is related to the distribution of the variables at a larger scale L. This was discussed 
in [||, |5|, H for the marginal distribution of e^ r . The answer is simple: they are related 



by a multiplicative process. Denoting by an its associated stochastic variable, this 
means that 

e jt i = an L e jiL (4) 

The random variable an, is independent of e^. The symbol = indicates that the 
equality holds in the distributional sense, that is, both sides of the equality have the 
same distribution. This relation implies a linear relation between the logarithms of 



the variables at two different scales. This property has also been discussed in [[21 
although the existence of a multiplicative process was not noticed. 

The process for arbitrary changes in scale was derived in || where it was shown 
that the random variable an, follows a log-Poisson process [|18|. This process can be 



justified as follows. As the scale is gradually reduced from L to I some changes in 
contrast are left outside the segment of size L. But some of these changes are special 
in that they give rise to a singular behaviour of the derivatives of the contrast, and in 
turn to a discontinuity in the e's, which acquire a factor (3 each time that this occurs 
(0 < (3 < 1). We will refer to this effect by saying that a modulation has occurred. 
It will be assumed that: 

• modulations are independent events. The average number of them contained in 
the change of scale (InL — In/)) is denoted by s, 

• the image ensemble is translational invariant, 

• the multiplicative process is scale invariant. 

Under a finite change from L to I, there will be n modulations with probability 
p n that, given the first assumption, is Poisson: 

„n 

Pa = -re" s (5) 

n\ 

The value of an after n modulations is proportional to (3 n : an = f3 n M(l, L). The fact 

that the proportionality constant is not one can be understood by noticing that if no 

modulations occur (i.e., n — 0), e,y keeps as much singular structure in the interval I 

as that present in e^L, which is distributed in the larger interval L. 

The dependence of M(l, L) with the scales I and L can be obtained by invoking 

a scale invariant multiplicative process. In this case, it can only depend on the ratio 

l/L, that is M(l,L) = M(4-). A standard argument shows that it has to be a power 

law: if the change from L to I is done through an intermediate scale r, then because 

of the multiplicative character of the process we have M(jr) = M(^)M(^), which 

can only be satisfied if M(j) — (r) ■ The exponent A is another parameter of 
the model. 

Taking all these arguments into account, the distribution of In an can be expressed 
as: 

00 s n ( I ' L\\ 

p aiL (lnan) = e~ s J2 — 5 (lnan - n\nf3 - A\n(-)) . (6) 

n=0 n - ^ \ * / / 



Up to now we have considered s, (5 and A as independent parameters. However 
translational invariance fixes one of them. In fact, from the definition of the e^'s one 
easily checks that its average over a translational invariant ensemble of images does 
not depend on r. In turn, taking the average on both sides of eq. @, this implies that 

(otl) = 1 , (7) 

where (...) indicates the average over an ensemble of images. But then, imposing this 
condition over the value of this average obtained from the distribution in eq. (|6]) one 
has: 

A L 

In- , 8 



1-/3 r 

from where the average number of modulations per unit of change in scale is s = j—^. 
The existence of a multiplicative process has direct consequences on the scaling 
properties of the moments of e^ r . Let us denote these moments by (e? r ). If the log- 
Poisson model holds, then from eq. (§) it is easy to check that the moments have a 
property called Self- Similarity (SS), 

(4r> = A>^ , (9) 

where the r p 's are the SS exponents. Notice that this also implies that any moment 
can be expressed as a power of the moment of order q. Choosing q = 2 this means 
that 

<&> = ^2)[<a] p(p ' 2) , (io) 

This relation could hold even when SS is not true. It is called Extended Self- Similarity 
(ESS) P3| . The p(p, 2)'s are the ESS exponents and the A(p, 2)'s are geometrical fac- 
tors. The exponents p(p, 2) can be predicted using the distribution of the multiplica- 
tive process, eq. (||), to evaluate the moments of order p in eq. (^). This computation 
yields 

V 1 - P 

oM - T^T3 ~ TT=W ■ (U) 

Let us notice that although the model has two parameters, (3 and s, the ESS exponents 
p(p, 2) depend only on the modulation parameter f3. There is a simple relation between 
r p and p(p, 2): 

r p = -s(l - (3) 2 p(p,2) , (12) 

Conversely, it can be proven that if SS and ESS hold and the exponents p(p, 2) verify 
eq. (|TT|) , then e r can be described in terms of a multiplicative process (eq. (§)) of 
the log-Poisson type (eq. (]$))■ It is then enough to check eq. (|TT|) , from where the 
existence of a log-Poisson process is derived. 



2.3 Multifractality of images 

The process just described has an interesting geometrical interpretation, as has been 
discussed in ||. In fact, the power law behaviour of the moments (e^ r ), given in 
eq. (||p, can be traced back to the existence of a very irregular behaviour of e r (x), 
which can be expressed as: 

€r(x) = a(x) r m , (13) 

where the exponent h(x) depends on the site of the image. This property is, in turn, 
related to a singular behaviour of |VC|(x): h < indicates a divergence of |VC|(x) 
to infinity, while h > indicates finiteness and continuous behavior. The greater the 
exponent the smoother is the the change in contrast around that point. Eq. ( |l"3"D 
expresses that all the points are singular (in this wide sense) , and that the singularity 
exponent is not uniform. This property allows to classify the pixels in a given image: 
the set of points with a singularity exponent contained in the interval [h — Ah, h + Ah] 
(where A/, is a small positive number) define a class Fh- These classes are the fractal 
components of the image (the notion of fractality is discussed in |23], 24|). The smaller 



the exponent the more singular is the class, and the most singular component is the 
one with the smallest value of h. A mathematical object with this structure is said 
to be a multifractal, a concept originally introduced in the context of fully developed 
turbulence |22| . 



The irregular arrangement of pixels in a fractal component Fh can be characterized 
by counting the number of pixels contained inside a given ball of radius r, N r (h, Ah). 
As r — ► it is verified that 

N r {h,A h ) ~ r D{h) . (14) 

This exponent D(h) quantifies the size of the set of pixels with singularity h as the 
image is covered with small balls of radius r. It is the fractal dimension of the associ- 
ated fractal component Fh, and the function D(h) is called the singularity spectrum 
of the multifractal f22fl . 



There is an important connection between the local singularity analysis and the 
statistical description of the image in terms of the moments (e^ r ). In fact, the ESS 
exponents r p are the Legendre transform of the singularity spectrum D(h) ([{22[|, see 
also §), 

D(h) = min{ph + d — r p } , (15) 

where d = 2 is the dimensionality of the images. Given the r p 's, eqs. ( |i~2|) and (|TT|), 
the dimension spectrum can be predicted: 



D ^- D -~ h -^ 



i-h/ h+A 



(d-BJln/V ' (16) 

where D^ is defined as D^ = d — s. From here one notices that there is a minimum 
value of the singularity exponent h, given by h = —A which then is the singularity 
of the most singular fractal component. Its fractal dimension can be computed from 
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eq. ( |T6"D which yields D(—A) = D^. Then D^ is the dimension of the most singular 
set; it can be expressed in terms of the parameters of the model, e.g. and A, as 

d-D^^s^j— . (17) 

The points with the strongest singularity (h = —A) are those with the sharpest 
changes in contrast (edges). This can be explicitly verified by evaluating the singular- 
ity exponents of all the pixels in the image and extracting those with the minimum 
value of h (within a resolution Ah) @. Let us remark that the most singular points 
could have a rather weak singularity, but even in these cases they are the points where 
the most important changes in contrast occur. This happens, in particular, when there 
are no modulations (s = or, equivalently, Doo = 2), in which case A = 0. 

The parameter 8 was defined as the strengh of a modulation in the multiplicative 
process, but it can also be interpreted in terms of the geometrical properties of the 
edges in the image. In fact, eq. ( |TTD relates 8 with D^ and A, which describe proper- 
ties of the set of pixels where the changes in contrast are most singular: 8 = „ „ +1. 
For fixed D^, as 8 decreases, the singularity exponent of the edges, —A, decreases. 
In order to gain some intuition about this relation, let us consider an isolated jump in 
contrast. Its dimensionality is D^ = 1, and it can be proved [] that A = 1, what gives 
= 0. By contrast, images with statistics dominated by smooth changes of contrast 
have 8=1. This is so because A = 0, but this result can also be understood by 
noticing that the way to obtain smooth images with the log-Poisson process is to set 

= 1- 

The measurable quantities are the moments (e^ r ) , and from them one can obtain 

the SS exponents r p and the ESS exponents p(p,2). Given the experimental ESS 
exponents p(p,2), it is easy to obtain the value of 8 by performing a least squares 
regression on eq. ([TT]) • T° define the log-Poisson process completely one should still 
estimate D m = 2 — s, which can be done for instance using the measured value of r^- 
This can be seen from eq. ([12]) which after setting p = 2 yields 

D ~= 2 -(i^V • (18) 

Let us remark that this expression contains a dependence in 1/(1 — 8) 2 which makes 
the propagation of errors important as 8 approaches the value one, even if the uncer- 
tainty in the values of 8 and T2 are small. 



3 Choice of environments 

In the present study we used a data set of 200 images, having 1536 x 1024 pix- 
els and 16 bits in luminance depth. These images have been kindly provided to us 



by Hans van Hateren [28[ and were selected from his data set of about 4200 im- 
ages. The selected images can be observed and downloaded from the URL address 
[http://hlab.phys.rug.nl/archive.htmll . A list of them is provided in Table |l|. The files 



3 see, e.g., [§ 



are originally named as "IMK^.IMC". In the table only the number (#) of the cor- 
responding image is specified. 

To compare with previous studies we notice that the images used in || contained 
45 natural scenes with a size of 256 x 256 pixels and a luminance depth of 13 bits. 
This means that our present statistical database has 130 times as many bits as the 
previous one. 

The data set has been considered either complete or divided into four different 
subsets with 50 images in each of them. These four ensembles were selected to meet 
the following requirements: 

• Ensemble A: 

It contains natural scenes of trees and woods. The images do not have artificial 
objects or open skies. The woods are dense and neither shadows nor direct light 
rays are allowed. 

• Ensemble B: 

It contains natural scenes. The rest of conditions imposed in the definition of 
ensemble A have not been required here. 

• Ensemble C: 

Images containing both artificial and natural objects are included. We chose 
images that in a visual scan appear to have predominantly horizontal struc- 
tures, i.e, the horizontal edges seem to be longer than the vertical ones. We are 
interested in testing whether this implies a difference between the horizontal 
and vertical statistics. 

• Ensemble D: 



Out of the whole database of 4212 images in [28[ we picked up 50 randomly. 
No image belonging to any of the other three ensembles was included here. The 
intention is to have a data set as varied as possible. 

Each ensemble is used to study the vertical and the horizontal statistics (that is 
the marginal statistics of e^ ir (x) and e 1 , ir (x), respectively), what gives a total of eight 
different elements. 



4 Results 

The main purpose of this section is to establish whether the statistical properties 
of SS and ESS, eqs.(|S|, [TU]) , are present in the ensembles considered, and whether 
the ESS exponents can be predicted by the log-Poisson model, eq.@. This will be 
done taking three different categories of ensembles. In Sec. [4.2| we consider the four 



ensembles defined in Sec. Q for the horizontal and vertical directions. Next, in Sec. 
|4.3| we take as ensembles the individual images themselves, again for both directions. 
Finally in Sec. ^^ we regard the 200 images as a single ensemble and we accumulate 
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Ensemble 








Image 


number 






A 


0034 


0211 


0263 


0478 


0586 


0605 


0662 


0683 


0801 


0808 


0881 


1017 


1031 


1164 


1406 


2035 


2263 


2280 


2411 


2417 


2603 


2605 


2606 


2626 


2649 


2935 


3002 


3134 


3491 


3514 


3536 


3789 


3807 


3830 


3842 


3940 


4010 


4031 


4037 


4042 


4044 


4056 


4069 


4078 


4094 


4098 


4101 


4123 


4133 


4134 


B 


0037 


0071 


0222 


0265 


0511 


0758 


0774 


0807 


0836 


0877 


1206 


1341 


1375 


1814 


1831 


1836 


1852 


1867 


1889 


1926 


1976 


1992 


2071 


2159 


2214 


2409 


2437 


2619 


2623 


2805 


2806 


3236 


3269 


3318 


3414 


3415 


3611 


3615 


3635 


3913 


4028 


4066 


4087 


4089 


4107 


4121 


4122 


4139 


4184 


4193 


C 


0001 


0038 


0046 


0052 


0085 


0090 


0127 


0145 


0146 


0147 


0168 


0177 


0210 


0403 


0423 


0459 


1282 


1357 


1400 


1411 


1423 


1427 


1431 


1432 


1471 


1489 


1529 


1562 


1566 


1600 


1878 


1879 


1934 


1946 


1950 


1964 


3029 


3054 


3065 


3082 


3099 


3137 


3154 


3179 


3223 


3225 


3286 


3322 


3332 


3384 


D 


0288 


0325 


0342 


0446 


0609 


0668 


0685 


0695 


0850 


0879 


0914 


0920 


1092 


1171 


1191 


1201 


1249 


1322 


1414 


1422 


1462 


1645 


1691 


1775 


1945 


2231 


2235 


2499 


2582 


2749 


2750 


2770 


3079 


3123 


3181 


3214 


3356 


3474 


3496 


3540 


3546 


3680 


3752 


3784 


3786 


3894 


3952 


3988 


4058 


4086 



Table 1: The 200 selected images and its classification in the four ensembles A-D 

the horizontal and vertical statistics in the same distribution. Notice that the moments 
are self-averaging quantities but the ESS exponents, and therefore the parameter f3 
of the probability distribution of eq. (0) are not, so we do not expect the parameter 
(3 of an ensemble to be the average of the /3's of its images. 

4.1 Methods 

Here we describe the methods used to check the presence of SS, ESS and the log- 
Poisson model for given p-moments of e r . The statistical moments will be computed 
over the three different categories we are going to analyze, i.e., for each of the eight 
orientational ensembles, for single images and for the whole data set. The items of 
a category giving rise to the different moments (e£) will be referred to as elements. 
(Hence, we have eight elements in the first category, 400 in the second and just one 
in the third). 

The existence of SS will be checked by verifying that In < e^ r > is linear in lnr 
for each element (the symbol < . > denotes the ensemble average over the images 
belonging to that element). In order to present the results for different elements in a 
single graph and to evaluate the standard deviation of the data, we have proceeded 
as follows. Since each element has its own scaling exponents the curves have to be 
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normalized; we first define, 



S p (x) = In < e\ r > , (19) 

where x = lnr. When S p (x) vs x is fitted by a least-squares regression it yields the 
fitting line []: S p (x) = a + bx. We now perform a linear transformation 5 P — > S' such 
that S'= x for each ensemble. The same transformation is applied to S p (x) to give 



S p (x) - S p \x) = ° pyX [ . (20) 



S„(x) — a 
T 

The average over different S p of a given category can now be carried out, yielding 
the quantities SL ESS is analyzed in a similar way; now we define 

E p (y) = In < e\ r > (21) 

where y = In < e| r >. The fitting line for each element is expressed as E p (y) = c+d y. 
However now not only the fit parameters c and d depend on the element considered, 
but also the variable y, for fixed r, varies from element to element. Therefore, before 
transforming all the fitting lines onto a common function, one has to normalize y 
properly. This is done as: 

/ V Drain /no\ 

y^y= (22) 

Vmax Vmin 

where y m ax and y m in are respectively the maximum and minimum values of y for 
a given element. This maps y into the interval [0:1]. As before we apply to E p the 
transformation that yields E p — > E' = y'\ 

Ep ^E p (y) = —- _ . (23) 

u \ymax ymin) 

The experimental ESS exponents p(p, 2) for each element in the category are fitted 
by a least-squares regression. An important result is that each element follows the 
log-Poisson model although the optimal value of /3 varies from element to element. 
To represent the data from all the elements together we define a normalized p'(p, 2) 
as: 

p'(p,2) = ^4p*(p,2) (24) 

where p(p, 2) denotes the computed ESS exponents [] obtained from each element, p@ 
denotes the exponents predicted by the log-Poisson model (0) using the value of (3 

4 Notice that the fitting parameter b gives an estimate of the SS exponent t p . The linear regression 
is performed in the range r = [8 : 25], where the SS property is best realized 

5 These exponents are obtained with a linear regression of the ESS curves for each moment. This 
regression is performed in the range r = [8 : 25], where also the ESS property is best realized. The 
function p{p 1 2) is computed up to p = 7. For higher order moments the properties of SS and ESS 
begin to appear less clearly. The reason for this is that the main contribution to these moments 
comes from the tail of the distribution of e r> i, where the sampling is worse. 
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that best fits the data and (f°{p,2) are the exponents predicted by the log-Poisson 
model with a certain /3 conventionally set equal to 0.5. (i.e., half of its maximum 
value) . 

4.2 Ensembles 

We study the properties of the four ensembles in the two directions, which gives 
us eight elements altogether. Figs. |l|la and |l|lb show the mean S' Ax) and the standard 
deviation a s ii x \ for p = 2 and 7. It is observed that SS holds rather well, although 
there appear small deviations for the lowest and highest values of x. 

The values of E' p (y') are plotted together for the eight elements as a function of 
y'. This time we cannot average over the values of E' p (y'), since y' varies from element 
to element. Figs. [IJIa and |I|IIb for p = 2 and 7, respectively, show that ESS holds 
better than SS. 

Fig. |2] represents the mean value p'(p) over the eight ensembles as a function of 
p and its standard deviation. It is observed that the log-Poisson model is very well 
satisfied. In Table |], the observed values of (3 are given. There are important numerical 
differences among them. As we will see in the next section this also occurs when the 
images are analyzed individually. 



Ensemble and orientation 


P 


A horizontal 


0.17 


A vertical 


0.022 


B horizontal 


0.27 


B vertical 


0.14 


C horizontal 


0.10 


C vertical 


0.10 


D horizontal 


0.28 


D vertical 


0.030 



Table 2: Values of (3 for the four ensembles A-D and two orientations 

In most of the entries of Table |2| there is an anisotropy between the observed 
values of /3 along the horizontal and vertical directions. For instance, in ensemble A 
the horizontal (3 is much larger than the vertical one. A difference between the two /3's 
is still present in ensemble B, although it is smaller. Both are dominated by vertical 
statistics (they are made up of images of woods), which somehow is reflected in the 
different values observed for (3 along each direction. However, ensemble B is more 
isotropic. This can be due to the existence in this ensemble of structures not present 
in A such as shadows, open skies and clouds, and to the fact that the woods are not 
as dense as in A. 

Naively, one would expect a difference of the opposite sign in ensemble C, which 
was defined to be dominated by horizontal statistics. However, this particular en- 
semble reveals to be the most isotropic of the four. This apparent contradiction is 
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Figure 1: Test of SS and ESS for the eight ensembles. I) SS for the moments of 
order 2 (a) and 7 (b). The dashed line is f(x) = x. The diamonds represent the mean 
and the (small) bars the standard deviation over the eight elements. II) ESS for the 
moments of order 3 (a) and 7 (b). The dotted line is f{y') = y' 



explained by the fact that even when changes in contrast along the horizontal direc- 
tion are normally smooth, they present very localized fluctuations that contribute to 
€h, r ■ In fact , a closer inspection of the images in ensemble C shows that although the 
contrast is more correlated along the horizontal direction, the edges appear isotropi- 
cally distributed. Finally, ensemble D behaves similarly to A, except for the numerical 
values of j3. 

The values of (3 in Table |2| also inform of the singularity exponents of the edges, 
—A (cf. end of Sec. Q2.3|) ). For instance, and assuming that Doo ~ 1, the small values 
of j3 in the table indicate that the corresponding statistics is dominated by sharp 
contrast changes. This effect is more noticeable in the vertical variable than in the 
horizontal one. 
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Figure 2: Test of the log-Poisson model for the eight ensembles. The diamonds rep- 
resent the mean of p'(p, 2) over the eight ensembles as a function of p and the error 
bars are twice the standard deviation. These error bars are small and difficult to 
appreciate. The dashed line represents p°' 5 {p, 2), the log-Poisson p(p, 2) with (3 = 0.5. 

4.3 Single images 

Again, our aim is to establish whether the statistical properties of SS and ESS 
are present but this time in single images, and whether the ESS exponents can be 
predicted by a log-Poisson model for each of them. Now the moments < e p rl > are 
obtained by averaging over the pixels of each individual image. Note that the result- 
ing exponents p(p, 2) of the images of a certain ensemble do not necessarily average 
to the ensemble exponents. This is checked for the set of 200 images for both the 
horizontal and vertical direction. From now on we shall refer to this set as containing 
400 "images", implying that we consider the statistics from the horizontal and the 
vertical directions. 

The results for SS, ESS and the ESS exponents are presented applying the pro- 
cedures described in Sec. fO|. Fig. D) shows the data for SS and ESS while Fig. £| 



includes the verification that the log-Poisson model is satisfied and the corresponding 
distribution of values of (3 and D^. 

The means S' p and p(p, 2) are not over the 400 images but over the 375 images that 
best satisfy the log-Poisson model. Of the other 25 images (6.2%), 6 of them (1.5%), 
have been discarded because their /3 exceeds 1, the maximum admissible one. The rest 
have been discarded because the log-Poisson model fits their p(p, 2) significantly less 
accurately than that of the 375 selected images. This can be seen in Figs. f|Ia and |4]Ib 
The 375 images with smallest x 2 — S«=i (pG°? 2) — p^Qo, 2)) are a homogeneous set 
in the sense that discarding a few of the worst of them produces a small improvement 
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of the overall fit represented in Fig. [|Ib. On the contrary, discarding the 25 worst 
images of the complete set produces a significant improvement of the overall fit. The 
discarded images have some features in common. Almost all of them either are obscure 
and therefore lack definition or are scenes with little structure: sea, rivers, ground with 
little more than grass, buildings with simple repeating patterns. 

Fig. ^Ila shows that the probability of (3 decreases approximately linearly with (3. 
The corresponding histograms for the eight ensembles considered in Sec. ( |4.2|) have 
a similar shape, though more irregular. They extend also over the whole range [0:1]. 
The histograms of the vertical ensembles are peaked at a lower value of /3 (~ 0.05) 
than the horizontal ones (~ 0.25), except for ensemble C, which is in accordance with 
the global values of (3 for these ensembles, shown in Table [2j. 

The fractal dimension of the set of edges, D^, can be estimated from eq.flllf) by 
measuring (3 and r 2 . As has already been mentioned in section (|2.3|) , the evaluation 
of this fractal dimension becomes difficult for large /3. In fact, D^ has unrealistic 
negative values for about 23% of the images which satisfy the log-Poisson model (all 
of them have (3 > 0.5). This can be explained by the fact that, for large /3, Eg. (|T8|) 
propagates errors in the estimation of j3 and r 2 as very large errors in the estimation 
of .Doo, due to the factor qzW- Furthermore, even if errors in (3 are not biased, eq.flPSp 
gives an estimation of D^ biased towards lower values 0, which may explain why, for 
some images, the dimension of the most singular fractal component (the set of edges) 
is negative. 

The result is shown in Fig. [|Hb. Notice that it is consistent with the expectation 
that the fractal dimension of the set of pixels with the largest changes in contrast 
should be close to one. Even if this histogram seems to be rather broad, a visual 
inspection of the fractal component corresponding to an image with, say, D^ = 1.3 
looks like a not too thick line. 

4.4 Whole data set 

We now perform the same analysis on the global data set containing the whole set 
of 200 images. These are again treated as 400 "images" by averaging together the 
moments for the horizontal and vertical variables e. The results are presented in 



terms of the same quantities (eqs. (|20D to (||])) used in Sec. [O] to allow for direct 
comparison. The value of/? obtained from this set is 0.17. Fig. [5] shows the SS test for 
the moments of order two and seven and the ESS test for the moments of order three 
and seven. The fit of the ESS exponents with a log-Poisson curve is presented in Fig. 
H (Notice that, following the convention defined in eq. (|23]) the curve is referred to 
(3q = 0.5). It can be seen that also for the whole data set all these properties are well 
confirmed. Again ESS holds better than SS. 



6 This can be seen by calculating the propagated error in Doo to second order in the error in /3. 

(I- 2 ?)* I 2 * 5 / 3 + TT^y 1 



For large enough (3 this order becomes relevant: SDoo = /i Zasa [2(5/3 + (1 \) (S/3) 2 
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Figure 3: Test of SS and ESS for the 375 out of 400 chosen images. I) SS for the 
moments of order 2 (a) and 7 (b). The dashed line is f(x) = x. II) ESS for the 
moments of order 3 (a) and 7 (b). The dashed line is f(y') = y' 



5 Generative model of images 

The fact that the SS exponents t p do not behave linearly with p implies that 
scale-invariance of natural images is rather subtle. It is related to the fact that for a 
given image not all the points transform in the same way under scale transformations 
0. As was shown in Sec. Q2.3J ), the log-Poisson model, which explains the non-linear 
behaviour of t p , also predicts the existence of a multifractal structure: the image pixels 
can be decomposed in fractal components such that points in the same component 
have the same scaling properties. 

Generative models of natural images based on their scale properties should take 



this important property into account. The intuitive model proposed in |29| puts the 
emphasis in reproducing the correct power spectrum, but it does it in such a way 
that all the points have the same scale properties. This is because all the changes in 
contrast are sharp, and equally singular, what gives rise to a single fractal component. 
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An evaluation of the SS exponents t p would show that these are linear in p. 

Here we address the issue of how to define a simple stochastic model for image 
generation which takes into account the properties discussed in the previous sections 
and that has the correct power spectrum. The model described below is based on a 
wavelet expansion of the contrast C(x) itself. Although generative models of images 
based on a wavelet expansion have been proposed before (see, e.g., |Kj), the existence 
of a multiplicative process associated with the wavelet coefficients was noticed more 
recently in ||. According to this result, the multifractal structure described in section 
Q2.3|) is produced through a convenient stochastic process for the wavelet coefficients. 

We will express the contrast C{x) of the image in terms of a dyadic wavelet set 
p3| ^a%{%) = ^(2^x — k), where the scale r is given by r = 2~ J (j G Z), and the 

space is sampled at the points xq = 2~^k, with k = (ki,k 2 ) (fci and k 2 are integer 
numbers taking values from to 2 3 ' — 1). 

C(x) = E E 7 ijfe -%(*) • (25) 

j=o fe 1 ,fc 2 =o 

where N is the number of scales considered; thus, the number of pixels of the image 
is 2 N x 2 N (each change in scale is of size 2). Synthetic images will be generated with 
N = 8, which gives rise to images with 256 x 256 pixels. The mother wavelet \J/(af) 
will be taken as the Laplacian of a gaussian with zero mean and standard deviation 
one. 

The wavelet coefficients 7. g are then generated following a hierarchical procedure 
which starts at the scale with the poorest resolution (j = 0), which is described by 
the mother wavelet. Its coefficient, 7 0( j, will appear just as a normalization of the 
contrast level. It is then used to generate the coefficients 7^ (kx,k 2 = 0,1) of the 
wavelets located at the four points used to sample the next scale (j = 1). They are 
of the form r y 1 % = «ifc7oo' where the a^'s are independent, identically distributed 
random variables which follow a given probability distribution (to be given below). 
This operation is repeated recursively at every new scale. At the scale j — 1 each 
7j_i u generates, by multiplication by the corresponding independent a's, four new 
coefficients at the scale j: 

1 . , 



Ijk ~~~ 2 i k '.?'-l,| 



2' 



where [|] means the vector with components equal to the integer part, rounding down, 
of those of k divided by 2. 

The projections of such C(x) over appropriate wavelets of size r possess the prop- 
erty of SS. Let us consider first the dual wavelet \P (that is, (\l/ -^ -,p) = 2~ 2: >5jji5^,). 
Defining the wavelet projection T^C(x) as: 

7 Although in a different context, the model has been used before in Jul to study synthetic 
turbulence. 

8 The interesting question of how to choose a convenient mother wavelet for a given image ensemble 
is not addressed here. This problem is discussed in p3| 
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TI,C(x) = I J dy n^-)C(y) (27) 



It follows from eq. (25) that: 



(\TlCn = A$r* (28) 



where 



r„ c = p - log 2 aP , (29) 



where aP is the moment of order p of the distribution of a. Then, there exists at least 



the wavelet \l/ for which the wavelet projections of C have SS, eq. fl28|) (in fact, all 
the wavelets vanishing the same number of moments define wavelet projections of C 
with SS of the same exponents rf 7 ). As shown in ||, this implies that also &? verifies 
SS (i.e. it follows eq. (§)) and there is a simple relation between the SS exponents r p 
and r p given by: 

r p = r p c - p (30) 

and using eq. (p9|) we obtain r p = — log 2 «p. It follows that in order to obtain SS 
exponents t p according to the log-Poisson model the random the a.g are computed 
according the log-Poisson distribution eq. (^) where the ratio - is 2. 

We have still to check that this model has the correct second order statistics ||. 
Using a 512 x 512 image produced with the generative model we have verified that 
the power spectrum is indeed of the form S(f) ~ 1/ f 2 , for an appropriately chosen 
\&'s. The plot and the fit are exhibited in Fig. |7|. 

5.1 Geometrical interpretation of the generative model 

Table [5] presents a catalogue of images produced with the generative model for 
different values of the parameters (3 and s. The images have been generated in such 
a way that those with identical s (shown in the same column) contains the same 
number of modulations for each a-r. That is, a given site jk has a random choice 
of n-t modulations which is the same for all the images in the column. There is 
also a continuous change for fixed j3 as s varies, which is obtained keeping fixed the 
cumulative probabilities used to generate the number of modulations at a given site. 

The geometrical characteristics of the images are clear from visual inspection of the 
figures in Table §. The images in the column s = are all the same, as the possibility 
of having even one modulation is neglected. These images are also obtained as the 
limit /3 — > 1 (this is because for (3 = 1 the modulations become irrelevant, as they do 
not change the value of 7 ■£) . These two cases should correspond to a uniform contrast 
field (the non-uniformity in the borders is due to finite size effects). For non-trivial 
values of s and (3 the following properties are observed: 

• Dao is directly related to s by eq. ([l7|) : Doo = 2 — s. Therefore, for fixed (3, as s 
increases D^ decreases, as can be seen in Table [3] going from left to right. 



• As (3 increases at fixed s, A decreases. Visually it can be appreciated that the 
singularity exponents decrease in Table |3| from top to bottom. 

5.2 Numerical analysis of the generative model 

The artificial images are constructed to have a certain theoretical /3 and s. We study 
here the properties of 32 generated images of 256 x 256 pixels with (3 = 0.40. The 
histogram in Fig. || shows the distribution of the 64 computed (3's for the horizontal 
and vertical direction of each of the 32 images. The mean of the histogram is (3 = 0.31 
with standard deviation op = 0.13. The fluctuations around the mean and its bias 
with respect to the theoretical (3 are mainly due to the finite size of the images and 
not to the size of the data set. To verify this we have randomly divided the data set 
in two sets of 12 and 20 images and have checked that neither their means nor their 
standard deviations change appreciably. 

On the other hand, averaging the moments of the 64 cases (which approximates 
very well the moments computed over a single 2048 x 2048 image) yields (3 = 0.38 
(Fig. |9|a), rather close to the theoretical 0.40. 

What is interesting is that even when these fluctuations in f3 are due to finite 
size effects the multiplicative process for each image is preserved. This can be seen 
in Fig. ||b where the mean and variance of the normalized ESS exponent p'(p,2) 
(averaged over the 64 cases) is shown. 

One can wonder whether the fluctuations in /3 observed in the true images could 
be thought of as finite size effects. This does not seem to be the case. As we have seen 



in Sec. [4.4| even the whole set of images follows the log-Poisson model with a given (3. 
Considering this set as a single large image, each of its pieces (i.e. the single images) 
should follow the log-Poisson model with /3's distributed as in Fig. |8|. However, if the 
fluctuations of the artificial images were a good model for the fluctuations of the true 
ones, then the dispersion of the histogram in Fig. |8| should be larger than that of the 
histogram in Fig. f§III (because the synthetic images are smaller than the true ones). 

6 Discussion 

One of the main conclusions of this work is that the multiscaling properties of changes 
in contrast are robust in natural scenes: Both SS and ESS are present in almost every 
single image considered, which are of very different type. In addition, a vast number 
of the images in our data set (94%) can be described in terms of a multiplicative 
process of the log-Poisson type. The modulation parameter (3 is distributed over its 
admissible range, although the probability distribution decreases with increasing j3. 
Four different ensembles of images (described in Sec. ^) were considered in order to 
understand how the presence or absence of certain features affects the edge statistics. 
The moments of the relevant variables e r were then computed by averaging over all the 
images in the ensemble. It was found that SS, ESS and the same multiplicative process 
are also present in each of the four ensembles and for both vertical and horizontal 
statistics. Again, these eight cases differ only in the value of (3. 
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The same properties (SS, ESS and log-Poisson) hold when the whole data set is 
taken as the ensemble. We can conclude from our experimental study of the three 
categories of ensembles that SS, ESS and the log-Poisson model are robust features 
of natural images. It is very remarkable that ensembles following log-Poisson laws 
with different values of the parameter f3 can be combined together giving again a 
log-Poisson model. 

One point that we want to emphasize is that the existence of a multiplicative 
process of a given type (the log-Poisson model) is very robust. It is present in images 
with rather different aspect, what supports the expectation that it has actually been 
detected by the early visual system in the course of evolution. If so, it should be 
contained in the structure of the receptive fields. On the other hand, there is a high 
variability in the value of the model parameter (3. Then, when the image ensemble 
changes, the cell should adapt to the new value of /3. This is not an unusual situation 
in the visual system: it adapts, for instance, to the average contrast 



The generative model studied in this paper provides a simple way to produce 
images with the multiscaling properties of natural scenes. It could be used as a better 
description of the contrast distribution that contains not only the correct second 
order statistics but also a correct description of the non-gaussian aspects of changes 
in contrast. The model could then be used to study the consequences that these 
properties have on the early stages of the visual system |32 |. 



The model is not based on objects. A quick glance to Table [3] shows cloudy struc- 
tures that have little to do with natural images (except e.g. those of the sky). In 
particular their appearance is rather fractionary, lacking real contours. However the 
model could provide an indication of which aspects of the visual processing depend 
only on the multiscaling properties and which ones are more dependent on spatial 
correlations. 

By analyzing the finite size effects of synthetic images, we have concluded that the 
value of (3 is an essential feature of each natural image, which cannot be regarded as 
a fluctuation from a universal value. The generative model was also used to illustrate 
the meaning of the parameters (3 and s. 

The wavelet used in the generative model studied in this work was somewhat 
arbitrary. A better way to represent the multiscaling properties of natural images is 
to derive a wavelet basis from the image data set itself. When this is done, it is found 
that the coefficients of the wavelet expansion are scale-independent and weakly space- 
dependent, thus providing a quasi-optimal code for the image p2| . Their distribution 



for a given image is log-Poisson (eq. (^)), with exactly the same parameter (3 that 
would be obtained from the statistics of e r . This is a remarkable result, and supports 
our proposal that the visual system has adapted to the multiscaling properties of 
natural scenes and that its architecture contains, in particular, information about the 
log-Poisson multiplicative process. 

Further equalization of these quasi-independent coefficients that follow the dis- 
tribution given in eq. (||) could be an advantageous strategy for the visual system 
since it optimizes the mutual information between the source (visual stimuli) and 
the internal representation |34], |35| . Since the probability distributions of the quasi- 
independent coefficients depend on (3 this proposed equalization depends on the image 
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being processed. 

To conclude, in this work we have established that the multiscaling properties 
of changes in contrast are a robust feature in natural images. The existence of this 
statistical regularity for single images has several implications for visual processing. 
This result is relevant because it implies a reduction of the entropy of the distribution 
probability of the ensemble of natural images. As a new statistical regularity is found, 
the set of images that can be considered as natural becomes more restricted. The fact 
that this property is present not only in ensembles but also in single images greatly 
reduces the entropy of the ensemble of natural images. Therefore its existence should 
be taken into account when modeling visual processing in the brain. 
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Figure 4: Test of the log-Poisson model for the 400 images. la) The mean of the 
p'(p, 2)'s over the 400 images. lb) The mean of the p(p, 2)'s for the 375 best images. 
Again, for a) and b) the diamonds represent the mean of p'(p, 2) and the error bars 
are twice the standard deviation. The dashed line represents p°' 5 {p, 2). Ha) Histogram 
of values of j3 for the 375 selected images. lib) Histogram of values of D^ for the 290 
images with Doo in the range [0:2] 
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Figure 5: Test of SS and ESS for the whole image data set (400 images). I) SS test for 
the moments of order two (a) and seven (b). The dashed line represents f(x) = x. II) 
ESS test for the moments of order three (a) and seven (b). The dotted line represents 

f(y')=y'- 
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Figure 6: Test of the log-Poisson model for the whole data set. 




Figure 7: log-log plot of the power spectrum along the a) horizontal and b) vertical 
directions for a 512 x 512 image produced with the generative model. The straight 
lines correspond to l// 2 fits; the horizontal axes are given in cycles per image. 
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Table 3: Catalogue of synthetic images. The case (5 
images are like those for s = (see text) 
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Figure 8: Generative model. Histogram of (3 for the 32 256 x 256 synthetic images 
(and two directions) with a theoretical (3 = 0.40 
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Figure 9: Test of the log-Poisson multiplicative process for the generative model, a) 
p(jp, 2) for the averaged moments of the whole ensemble of 32 256 x 256 images. The 
dashed line corresponds to the ESS exponents predicted by the log-Poisson model for 
(3 = 0.38. b) The mean of p'(p, 2) over the 32 synthetic images and the two directions, 
where now p'(p, 2) has been normalized to /3 — 0.38. The dashed line represents 
p 0,38 (p, 2), the log-Poisson p(p, 2) with f3 = 0.38. The error bars are of the size of the 
data points. 
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